The set cover problem is a classical question in combinatorics, computer science, operations research, and complexity theory.
Given a set of elements (henceforth referred to as the universe, specifying all possible elements under consideration) and a collection, referred to as , of a given subsets whose union equals the universe, the set cover problem is to identify a smallest sub-collection of whose union equals the universe.
For example, consider the universe, and the collection of sets In this example, is equal to 4, as there are four subsets that comprise this collection. The union of is equal to . However, we can cover all elements with only two sets: , see picture, but not with only one set. Therefore, the solution to the set cover problem for this and has size 2.
More formally, given a universe and a family of subsets of , a set cover is a subfamily of sets whose union is .
The decision version of set covering is NP-complete. It is one of Karp's 21 NP-complete problems shown to be NP-complete in 1972. The optimization/search version of set cover is NP-hard. It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms.
In the fractional set cover problem, it is allowed to select fractions of sets, rather than entire sets. A fractional set cover is an assignment of a fraction (a number in 0,1) to each set in , such that for each element x in the universe, the sum of fractions of sets that contain x is at least 1. The goal is to find a fractional set cover in which the sum of fractions is as small as possible. Note that a (usual) set cover is equivalent to a fractional set cover in which all fractions are either 0 or 1; therefore, the size of the smallest fractional cover is at most the size of the smallest cover, but may be smaller. For example, consider the universe and the collection of sets The smallest set cover has a size of 2, e.g. But there is a fractional set cover of size 1.5, in which a 0.5 fraction of each set is taken.
| minimize | (minimize the number of sets) | ||
| subject to | for all | (cover every element of the universe) | |
| for all . | (every set is either in the set cover or not) |
Weighted set cover is described by a program identical to the one given above, except that the objective function to minimize is , where is the weight of set .
Fractional set cover is described by a program identical to the one given above, except that can be non-integer, so the last constraint is replaced by .
This linear program belongs to the more general class of LPs for , as all the coefficients in the objective function and both sides of the constraints are non-negative. The integrality gap of the ILP is at most (where is the size of the universe). It has been shown that its relaxation indeed gives a factor- approximation algorithm for the minimum set cover problem. See randomized rounding#setcover for a detailed explanation.
To show that the problems are equivalent, for a universe of size and collection of sets of size , construct and . Then a set cover of is equivalent to a hitting set of where , and vice versa.
This equivalence can also be visualized by representing the problem as a bipartite graph of vertices, with vertices on the left representing elements of , and vertices on the right representing elements of , and edges representing set membership (i.e., there is an edge between the -th vertex on the left and the -th vertex of the right iff. ). Then a set cover is a subset of right vertices such that each left vertex is adjacent to at least one member of , while a hitting set is a subset of left vertices such that each right vertex is adjacent to at least one member of . These definitions are exactly the same except that left and right are swapped. But there is nothing special about the sides in the bipartite graph; we could have put the elements of on the right side, and the elements of on the left side, creating a graph that is a mirror image of the one described above. This shows that set covers in the original graph are equivalent to hitting sets in the mirrored graph, and vice versa.
In the field of computational geometry, a hitting set for a collection of geometrical objects is also called a stabbing set or piercing set.
This greedy algorithm actually achieves an approximation ratio of where is the maximum cardinality set of . For dense instances, however, there exists a -approximation algorithm for every .
There is a standard example on which the greedy algorithm achieves an approximation ratio of . The universe consists of elements. The set system consists of pairwise disjoint sets with sizes respectively, as well as two additional disjoint sets , each of which contains half of the elements from each . On this input, the greedy algorithm takes the sets , in that order, while the optimal solution consists only of and . An example of such an input for is pictured on the right.
Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover up to lower order terms (see Inapproximability results below), under plausible complexity assumptions. A tighter analysis for the greedy algorithm shows that the approximation ratio is exactly .Slavík Petr A tight analysis of the greedy algorithm for set cover. STOC'96, Pages 435-441,
If the constraint is replaced by for all in in the integer linear program shown above, then it becomes a (non-integer) linear program . The algorithm can be described as follows:
In low-frequency systems, proved it is NP-hard to approximate set cover to better than . If the Unique games conjecture is true, this can be improved to as proven by .
proves that set cover instances with sets of size at most cannot be approximated to a factor better than unless '''P''''''NP''', thus making the approximation of of the greedy algorithm essentially tight in this case.
|
|